Discourse Chunking and its Application to Sentence Compression
نویسندگان
چکیده
In this paper we consider the problem of analysing sentence-level discourse structure. We introduce discourse chunking (i.e., the identification of intra-sentential nucleus and satellite spans) as an alternative to full-scale discourse parsing. Our experiments show that the proposed modelling approach yields results comparable to state-of-the-art while exploiting knowledge-lean features and small amounts of discourse annotations. We also demonstrate how discourse chunking can be successfully applied to a sentence compression task.
منابع مشابه
Global inference for sentence compression : an integer linear programming approach
In this thesis we develop models for sentence compression. This text rewriting task has recently attracted a lot of attention due to its relevance for applications (e.g., summarisation) and simple formulation by means of word deletion. Previous models for sentence compression have been inherently local and thus fail to capture the long range dependencies and complex interactions involved in tex...
متن کاملA Unified Framework for Discourse Argument Identification via Shallow Semantic Parsing
This paper deals with Discourse Argument Identification (DAI) from both intra-sentence and inter-sentence perspectives. For intra-sentence cases, we approach it via a simplified shallow semantic parsing framework, which recasts the discourse connective as the predicate and its scope into several constituents as the argument of the predicate. Different from state-of-the-art chunking approaches, ...
متن کاملDiscourse Segmentation for Sentence Compression
Earlier studies have raised the possibility of summarizing at the level of the sentence. This simplification should help in adapting textual content in a limited space. Therefore, sentence compression is an important resource for automatic summarization systems. However, there are few studies that consider sentence-level discourse segmentation for compression task; to our knowledge, none in Spa...
متن کاملDiscursive Sentence Compression
This paper presents a method for automatic summarization by deleting intra-sentence discourse segments. First, each sentence is divided into elementary discourse units and, then, less informative segments are deleted. To analyze the results, we have set up an annotation campaign, thanks to which we have found interesting aspects regarding the elimination of discourse segments as an alternative ...
متن کاملPlagiarism Detection and Document Chunking Methods
This paper describes the tests made on chunking methods used for plagiarism detection. The result of the tests makes it possible to decide on the best fitting chunking method for a given application. For example, overlapping word chunking is good for a grammar analyzer or for small databases, sentence chunking suits best for finding quoted texts, hashed breakpoint chunking is the fastest method...
متن کامل